AritPIM: High-Throughput In-Memory Arithmetic


Abstract

Digital processing-in-memory (PIM) architectures are rapidly emerging to overcome the memory-wall bottleneck by integrating logic within memory elements. Such architectures provide vast computational power within the memory itself in the form of parallel bitwise logic operations. We develop novel algorithmic techniques for PIM that, combined with new perspectives on computer arithmetic, extend this bitwise parallelism to the four fundamental arithmetic operations (addition, subtraction, multiplication, and division), for both fixed-point and floating-point numbers, using both bit-serial and bit-parallel approaches. We propose a state-of-the-art suite of arithmetic algorithms, demonstrating the first algorithm in the literature of digital PIM for a majority of cases, including cases previously considered impossible for digital PIM, such as floating-point addition. Through a case study on memristive PIM, we compare the proposed algorithms to an NVIDIA RTX 3070 GPU and demonstrate significant throughput and energy improvements.
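To make the idea of building arithmetic from parallel bitwise operations concrete, the following is a minimal sketch of bit-serial, element-parallel addition. It is an illustration only and is not taken from the AritPIM paper: the function names (`to_bit_planes`, `from_bit_planes`, `bit_serial_add`) and the bit-plane layout are assumptions, and a plain ripple-carry full adder stands in for whatever gate set a real PIM array provides. Each Python integer plays the role of one memory column, so a single bitwise operation acts on all rows at once.

```python
def to_bit_planes(values, width):
    """Transpose unsigned ints into `width` bit-planes (hypothetical layout).

    Plane i is a Python int whose bit r holds bit i of values[r],
    mimicking one memory column shared by all rows.
    """
    planes = []
    for i in range(width):
        plane = 0
        for r, v in enumerate(values):
            plane |= ((v >> i) & 1) << r
        planes.append(plane)
    return planes


def from_bit_planes(planes, rows):
    """Inverse of to_bit_planes: rebuild one integer per row."""
    return [
        sum(((p >> r) & 1) << i for i, p in enumerate(planes))
        for r in range(rows)
    ]


def bit_serial_add(a_planes, b_planes):
    """Add two sets of bit-planes, one bit position per step.

    Every step uses only bitwise XOR/AND/OR on whole planes, so all
    rows are processed in parallel; the loop is serial over bits.
    Returns width + 1 planes (the last one is the carry-out).
    """
    carry = 0
    out = []
    for a, b in zip(a_planes, b_planes):
        out.append(a ^ b ^ carry)               # sum bit of a full adder
        carry = (a & b) | (carry & (a ^ b))     # carry bit of a full adder
    out.append(carry)
    return out
```

For example, `from_bit_planes(bit_serial_add(to_bit_planes(xs, 8), to_bit_planes(ys, 8)), len(xs))` adds the lists `xs` and `ys` element by element. The number of steps depends on the bit width, not on the number of rows, which is the source of the throughput such architectures aim for.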


Similar resources

Memory Efficient Arithmetic

In this paper we give an algorithm for finding the mth base-b digit of a positive integer n (m = 1 is the least significant digit) defined as the final number in a sequence of integers obtained by multiplying, adding, and subtracting previous numbers in the sequence (actually, the algorithm finds arbitrarily precise approximations to n/b^m (mod 1), which can be used to get this mth digit whenever ...


High-Throughput, Low-Memory Applications on the Pica Architecture

This paper introduces Pica, a fine-grain, message passing architecture designed to efficiently support high-throughput parallel applications. This focus on high-throughput applications allows a small local memory of 4096 36-bit words. The architecture minimizes overhead for basic parallel operations. An operand-addressed context cache and round-robin task manager allow single cycle task swaps. ...


High-Throughput and Memory Efficient LDPC Decoder Architecture

Low-density parity-check (LDPC) codes are a prominent class of error-correcting codes (ECC) being considered for next-generation industry standards. Decoder implementation complexity has been the bottleneck in their application. This paper presents a new kind of high-throughput and memory-efficient LDPC decoder architecture. In general, more than fifty percent of memory can be saved over conven...



Journal

Journal title: IEEE Transactions on Emerging Topics in Computing

Year: 2023

ISSN: 2168-6750, 2376-4562

DOI: https://doi.org/10.1109/tetc.2023.3268137